Sains Malaysiana 53(11)(2024): 3817-3829
http://doi.org/10.17576/jsm-2024-5311-23
Modelling Malaysia Air Quality Data using Bayesian Structural Time Series Models
(Memodelkan Data Kualiti Udara Malaysia menggunakan Model Siri Masa Berstruktur Bayesian)
AESHAH MOHAMMED1,2, MOHD AFTAR ABU BAKAR1,*, MAHAYAUDIN M. MANSOR3 & NORATIQAH MOHD ARIFF1
1Department of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia
2Faculty of Science, University of Benghazi, AL Marje, Libya
3School of Mathematical Sciences, College of Computing, Informatics and Mathematics, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia
Received: 9 February 2024/Accepted: 27 September 2024
Abstract
Air pollution poses a significant threat to human health and the environment, especially in developing nations facing rapid industrialization, urbanization, and increased vehicle emissions. As cities and factories continue to grow, the air quality problem worsens, making it crucial to enhance the monitoring, testing, and forecasting of air quality. In this context, this study builds air quality models using Bayesian Structural Time Series (BSTS) models to predict air quality levels in Malaysia. The BSTS model integrates three main techniques: the structural model, which employs the Kalman filter approach to model trend and seasonality components; spike and slab regression for variable selection; and Bayesian model averaging to estimate the best-performing prediction model while accounting for uncertainty. The study used air quality time-series data spanning approximately two years, from June 2017 to July 2019, obtained from the Malaysian Department of Environment (DOE). The primary objective was to forecast air quality and assess the effectiveness of Bayesian structural time series analysis on air quality time-series data. The results indicated that the BSTS technique can model air quality time-series data with high accuracy, effectively capturing seasonal and trend components. The seasonal component showed a repeating weekly concentration pattern, while the local linear trend component showed a steady decline in PM10 and PM2.5 concentration levels at most stations. Regression analysis demonstrated that humidity and ambient temperature significantly affected air quality in most locations in Malaysia.
Keywords: Air quality; Bayesian Structural Time Series; Markov Chain Monte Carlo (MCMC); spike and slab regression
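To make the structural component mentioned in the abstract more concrete, the following is a minimal sketch of a Kalman filter for a local linear trend model, written in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function name `kalman_filter_llt`, the fixed variance values, and the vague initial prior are all hypothetical choices, and the seasonal component, spike and slab regression, and MCMC estimation of the variances used in a full BSTS model are omitted.

```python
import numpy as np


def kalman_filter_llt(y, sigma_obs=1.0, sigma_level=0.1, sigma_slope=0.01):
    """Kalman filter for a local linear trend model (illustrative sketch).

    Assumed model, with state vector [level, slope]:
        y_t         = level_t + eps_t,           eps_t ~ N(0, sigma_obs^2)
        level_{t+1} = level_t + slope_t + u_t,   u_t   ~ N(0, sigma_level^2)
        slope_{t+1} = slope_t + v_t,             v_t   ~ N(0, sigma_slope^2)

    The variances are fixed inputs here (an assumption); a BSTS fit would
    sample them with MCMC instead.
    Returns an array of shape (len(y), 2) with the filtered level and slope.
    """
    y = np.asarray(y, dtype=float)
    T = np.array([[1.0, 1.0], [0.0, 1.0]])         # state transition matrix
    Z = np.array([1.0, 0.0])                       # observation vector
    Q = np.diag([sigma_level**2, sigma_slope**2])  # state noise covariance
    H = sigma_obs**2                               # observation noise variance

    a = np.array([y[0], 0.0])  # prior mean: first observation, zero slope (assumption)
    P = np.eye(2) * 1e4        # vague prior covariance (assumption)

    filtered = np.zeros((len(y), 2))
    for t, obs in enumerate(y):
        # Update step: correct the predicted state with the new observation.
        v = obs - Z @ a              # one-step-ahead forecast error
        F = Z @ P @ Z + H            # forecast error variance
        K = P @ Z / F                # Kalman gain
        a_upd = a + K * v
        P_upd = P - np.outer(K, Z @ P)
        filtered[t] = a_upd
        # Prediction step: propagate the state to the next time point.
        a = T @ a_upd
        P = T @ P_upd @ T.T + Q
    return filtered


if __name__ == "__main__":
    # Synthetic, noise-free declining series (not real PM10 data).
    series = 60.0 - 0.2 * np.arange(100)
    states = kalman_filter_llt(series)
    print("final filtered level and slope:", states[-1])
```

In this sketch the slope state plays the role of the trend direction: a persistently negative filtered slope corresponds to the kind of steady decline in concentration that the abstract reports for the local linear trend component.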
Abstrak
Pencemaran udara menimbulkan ancaman besar kepada kesihatan manusia dan alam sekitar, terutamanya di negara membangun yang menghadapi perindustrian pesat, pembandaran dan peningkatan pelepasan kenderaan. Perkembangan bandar dan pertambahan kilang mengakibatkan masalah kualiti udara bertambah buruk, menjadikan pentingnya pemantauan, ujian dan ramalan kualiti udara. Dalam konteks ini, kajian ini tertumpu kepada pembinaan model kualiti udara menggunakan model Siri Masa Berstruktur Bayesian (BSTS) untuk meramalkan tahap kualiti udara di Malaysia. Model BSTS menyepadukan tiga teknik utama: Model struktur yang menggunakan pendekatan penapis Kalman untuk memodelkan komponen trend dan bermusim; regresi spike dan papak untuk pemilihan berubah; dan model Bayesian secara purata untuk menganggarkan model ramalan berprestasi terbaik sambil mengambil kira ketidakpastian. Kajian itu menggunakan data siri masa kualiti udara yang menjangkau dua tahun dari Jun 2017 hingga Julai 2019 yang diperoleh daripada Jabatan Alam Sekitar Malaysia (JAS). Objektif utama kajian ini adalah untuk meramal kualiti udara dan menilai keberkesanan analisis BSTS terhadap data siri masa kualiti udara. Keputusan menunjukkan bahawa teknik BSTS mampu memodelkan data siri masa kualiti udara dengan ketepatan yang tinggi, menangkap komponen bermusim dan trend dengan berkesan. Komponen bermusim menunjukkan pengulangan corak kepekatan mingguan, manakala komponen aliran linear tempatan menunjukkan penurunan yang stabil dalam tahap kepekatan PM10 dan PM2.5 di kebanyakan stesen. Analisis regresi menunjukkan bahawa kelembapan dan suhu ambien menjejaskan kualiti udara dengan ketara di kebanyakan lokasi di Malaysia.
Kata kunci: Kualiti udara; Rantaian Markov Monte Carlo (MCMC); regresi pepaku dan papak; Siri Masa Berstruktur Bayesian
*Corresponding author; email: aftar@ukm.edu.my